Usage and Attribution of Stack Overflow Code Snippets in GitHub Projects
نویسندگان
چکیده
Stack Overflow (SO) is the largest Q&A website for software developers, providing a huge amount of copyable code snippets. Using those snippets raises maintenance and legal issues. SO’s license (CC BY-SA 3.0) requires attribution, i.e., referencing the original question or answer, and requires derived work to adopt a compatible license. While there is a heated debate on SO’s license model for code snippets and the required attribution, little is known about the extent to which snippets are copied from SO without proper attribution. We present results of a large-scale empirical study analyzing the usage and attribution of non-trivial Java code snippets from SO answers in public GitHub (GH) projects. We followed three different approaches to triangulate an estimate for the ratio of unattributed usages and conducted two online surveys with software developers to complement our results. For the different sets of projects that we analyzed, the ratio of projects containing files with a reference to SO varied between 3.3% and 11.9%. We found that at most 1.8% of all analyzed repositories containing code from SO used the code in a way compatible with CC BY-SA 3.0. Moreover, we estimate that at most a quarter of the copied code snippets from SO are attributed as required. Of the surveyed developers, almost one half admitted copying code from SO without attribution and about two thirds were not aware of the license of SO code snippets and its implications.
منابع مشابه
Are Code Examples on an Online Q&A Forum Reliable?
Programmers often consult an online Q&A forum such as Stack Overflow to learn new APIs. This paper presents an empirical study on the prevalence and severity of API misuse on Stack Overflow. To reduce manual assessment effort, we design Maple, an API usage mining approach that extracts patterns from over 380K Java repositories on GitHub and subsequently reports potential API usage violations in...
متن کاملSOTorrent: Reconstructing and Analyzing the Evolution of Stack Overflow Posts
Stack Overflow (SO) is the most popular question-and-answer website for software developers, providing a large amount of code snippets and free-form text on a wide variety of topics. Like other software artifacts, questions and answers on SO evolve over time, for example when bugs in code snippets are fixed, code is updated to work with a more recent library version, or text surrounding a code ...
متن کاملStack Overflow Considered Harmful? The Impact of Copy&Paste on Android Application Security
Online programming discussion platforms such as Stack Overflow serve as a rich source of information for software developers. Available information include vibrant discussions and oftentimes ready-to-use code snippets. Previous research identified Stack Overflow as one of the most important information sources developers rely on. Anecdotes report that software developers copy and paste code sni...
متن کاملGitHub and Stack Overflow: Analyzing Developer Interests Across Multiple Social Collaborative Platforms
Increasingly, software developers are using a wide array of social collaborative platforms for software development and learning. In this work, we examined the similarities in developer’s interests within and across GitHub and Stack Overflow. Our study finds that developers share common interests in GitHub and Stack Overflow; on average, 39% of the GitHub repositories and Stack Overflow questio...
متن کاملSupporting Comparison of Developer Profiles across Online Communities
Current software development practices leave a plethora of activities that are archived in version control systems, issue trackers, mailing lists, or Question and Answer (Q&A) forums. Software managers are increasingly using these online activities to better evaluate job candidates. We introduce our tool, Visual Resume, that displays visual overviews of developers’ contributions in code sharing...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1802.02938 شماره
صفحات -
تاریخ انتشار 2018